Towards Efficient Indexing of Arbitrary Similarity

Authors

Tomáš Bartoš

Tomáš Skopal

Juraj Moško

Abstract

The popularity of similarity search has expanded with the increased interest in multimedia databases, bioinformatics, and social networks, and with the growing number of users trying to find information in huge collections of unstructured data. During exploration, users handle database objects in different ways depending on the similarity model in use, which may range from simple to complex. Efficient indexing techniques for similarity search are required, especially for growing databases. In this paper, we study implementation possibilities of the recently announced theoretical framework SIMDEX, whose task is to algorithmically explore a given similarity space and find possibilities for efficient indexing. Instead of relying on a fixed set of indexing properties, such as the metric space axioms, SIMDEX seeks alternative properties that are valid in a particular similarity model (database) and, at the same time, provide efficient indexing. In particular, we propose to implement the fundamental parts of SIMDEX by means of genetic programming (GP), which we expect will provide a high-quality resulting set of expressions (axioms) useful for indexing.
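To make the idea of a property being "valid in a particular similarity model" concrete, the following is a minimal sketch, not the SIMDEX implementation. It assumes a candidate axiom can be written as a predicate over the three pairwise distances of an object triplet and simply measures how often it holds on a sample. All names (`holds_fraction`, `triangle_lower_bound`, `sqrt_lower_bound`, `squared_l2`) and the toy data are hypothetical and chosen for illustration only.

```python
import itertools
import math


def holds_fraction(objects, dist, candidate):
    """Fraction of ordered triplets (q, p, o) on which the candidate
    expression candidate(d(q,p), d(p,o), d(q,o)) evaluates to True."""
    total = 0
    valid = 0
    for q, p, o in itertools.permutations(objects, 3):
        total += 1
        if candidate(dist(q, p), dist(p, o), dist(q, o)):
            valid += 1
    return valid / total if total else 0.0


def triangle_lower_bound(d_qp, d_po, d_qo, eps=1e-12):
    # Classic metric lower bound: |d(q,p) - d(p,o)| <= d(q,o)
    return abs(d_qp - d_po) <= d_qo + eps


def sqrt_lower_bound(d_qp, d_po, d_qo, eps=1e-12):
    # Hypothetical alternative axiom: the same bound applied to
    # square roots of the distances.
    return abs(math.sqrt(d_qp) - math.sqrt(d_po)) <= math.sqrt(d_qo) + eps


def squared_l2(a, b):
    # Squared Euclidean distance: a simple non-metric dissimilarity
    # (it violates the triangle inequality).
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy "database" of one-dimensional points (illustrative data only).
points = [(0.0,), (1.0,), (2.0,), (5.0,)]

frac_triangle = holds_fraction(points, squared_l2, triangle_lower_bound)
frac_sqrt = holds_fraction(points, squared_l2, sqrt_lower_bound)

print(f"triangle lower bound holds on {frac_triangle:.0%} of triplets")
print(f"sqrt-based lower bound holds on {frac_sqrt:.0%} of triplets")
```

On this toy sample the plain triangle lower bound fails for some triplets, while the square-root variant holds for all of them. A GP-based search in the spirit of the abstract would presumably generate and score many such candidate expressions automatically; a validity fraction like the one above is only one possible ingredient of a fitness measure and says nothing yet about indexing efficiency.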

Similar resources

Universal Indexing of Arbitrary Similarity Models

The increasing amount of available unstructured content together with the growing number of large non-relational databases put more emphasis on the content-based retrieval and precisely on the area of similarity searching. Although there exist several indexing methods for efficient querying, not all of them are best-suited for arbitrary similarity models. Having a metric space, we can easily ap...

Optimizing Hashing Functions for Similarity Indexing in Arbitrary Metric and Nonmetric Spaces

A large number of methods have been proposed for similarity indexing in Euclidean spaces, and several such methods can also be used in arbitrary metric spaces. Such methods exploit specific properties of Euclidean spaces or general metric spaces. Designing general-purpose similarity indexing methods for arbitrary metric and non-metric distance measures is a more difficult problem, due to the vas...

Efficient Similarity Search for Time Series Data Based on the Minimum Distance

We address the problem of efficient similarity search based on the minimum distance in large time series databases. Most previous work has focused on similarity matching and retrieval of time series based on the Euclidean distance. However, as we demonstrate in this paper, the Euclidean distance has limitations as a similarity measure. It is sensitive to the absolute offsets of time seque...

Shock-Based Indexing into Large Shape Databases

This paper examines issues arising in applying a previously developed edit-distance shock graph matching technique to indexing into large shape databases. This approach compares the shock graph topology and attributes to produce a similarity metric, and results in 100% recognition rate in querying a database of approximately 200 shapes. However, indexing into a significantly larger database is ...

Hierarchical Bitmap Index: An Efficient and Scalable Indexing Technique for Set-Valued Attributes

Set-valued attributes are convenient to model complex objects occurring in the real world. Currently available database systems support the storage of set-valued attributes in relational tables but contain no primitives to query them efficiently. Queries involving set-valued attributes either perform full scans of the source data or make multiple passes over single-value indexes to reduce the n...

Publication date: 2013
